The Questions
- “I’m creating a scalar function that will be used against a column that contains multiple NULL values. Is there a way to stop the function from being executed when a NULL is passed in as a parameter value?”
- “I created a table-valued function that I want to apply to a column in each row returned by the query. However, when I try to join the function to the target table in the query’s FROM clause, I receive an error. Is there an effective way to invoke the function for each row in the result set?”
- “I’ve heard you can use a function to parameterize a view, but I don’t see how you can incorporate a function into a view definition in such a way to support parameters. Can you explain how that is done?”
- “I’m creating a user-defined function and want to specify a default value for an input parameter. Is that possible?”
- “I created a scalar function that’s used in multiple queries, some of which can return millions of rows. Ever since I incorporated the function into the queries, performance has nosedived. Are there steps we can take to fix this?”
- “I want to call the GETDATE system function from within a user-defined function, but I’ve read you cannot do this. Is there a workaround that lets me use the GETDATE function?”
- “I want to execute a stored procedure from within a user-defined function, but I keep running into errors in the syntax. How do I call a stored procedure from within a function?”
- “When creating user-defined functions, I like to include the SCHEMABINDING option, but in some cases, I receive an error when creating the function, saying that I cannot schema bind the function. Any idea what’s going on?”
- “I’ve seen user-defined functions called in different ways in code. Is there a proper way to call a function?”
- “I’m working on a query whose T-SQL code I want to encapsulate and parameterize. The query will aggregate a subset of values based on an input parameter and return the aggregated value as a column in the query results. Should I create a stored procedure or a user-defined function?”
“I’m creating a scalar function that will be used against a column that contains multiple NULL values. Is there a way to stop the function from being executed when a NULL is passed in as a parameter value?”
Yes. In fact, doing so is actually a very straightforward process, at least for scalar functions. You simply include the RETURNS
NULL
ON
NULL
INPUT
option in your WITH
clause, as shown in the following example (option highlighted):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
USE AdventureWorks2012; GO IF OBJECT_ID(N'dbo.fnGetTotalItems', N'FN') IS NOT NULL DROP FUNCTION dbo.fnGetTotalItems; GO CREATE FUNCTION dbo.fnGetTotalItems (@OrderID INT) RETURNS INT WITH RETURNS NULL ON NULL INPUT, SCHEMABINDING AS BEGIN DECLARE @TotalItems INT SELECT @TotalItems = SUM(OrderQty) FROM Sales.SalesOrderDetail WHERE SalesOrderID = @OrderID GROUP BY SalesOrderID RETURN @TotalItems; END; GO |
By default, when you call a scalar function, the database engine executes the function body whether or not a NULL
is passed in as a parameter value. However, by including the RETURNS
NULL
ON
NULL
INPUT
option, the database engine will not execute the function body when a NULL
value is passed in. For example, the following SELECT
statement returns a NULL
value:
1 |
SELECT dbo.fnGetTotalItems(null); |
When the database engine sees the NULL
input value, it simply returns NULL
without processing the function body. If your function supports multiple input parameters, the database engine returns NULL
if NULL
is passed into any one of those parameters and does not execute the function body.
However, you cannot use the RETURNS
NULL
ON
NULL
INPUT
option for a table-valued function. Because a table-value function returns a full resultset (table), it is possible for the function to return data even if a parameter value is NULL
. As such, the option cannot be practically applied to a table-valued function.
“I created a table-valued function that I want to apply to a column in each row returned by the query. However, when I try to join the function to the target table in the query’s FROM clause, I receive an error. Is there an effective way to invoke the function for each row in the result set?”
There is as long as you’re using SQL Server 2005 or later. Starting with SQL Server 2005, you’ve been able to use the APPLY
operator to join one or more tables to a table-valued function in order to invoke that function against each row in the resultset. Prior to SQL Server 2005, you had to come up with a complex workaround to achieve this.
The best way to understand how the operator works is to look at an example. The following T-SQL creates a table-valued function that returns the total number of items sold for each sale listed in the SalesOrderDetail
table of the AdventureWorks2012
database:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
USE AdventureWorks2012; GO IF OBJECT_ID(N'dbo.ifGetTotalItems', N'IF') IS NOT NULL DROP FUNCTION dbo.ifGetTotalItems; GO CREATE FUNCTION dbo.ifGetTotalItems (@OrderID INT) RETURNS TABLE WITH SCHEMABINDING AS RETURN ( SELECT SUM(OrderQty) AS TotalItems FROM Sales.SalesOrderDetail WHERE SalesOrderID = @OrderID GROUP BY SalesOrderID ); |
The function takes as an argument the SalesOrderID
value that identifies the sale. In this case, the function returns only a single row with one column, but you can just as easily create a function that returns multiple rows. The simplest way to verify that the function is working as you expect is to run a simple SELECT
statement that calls the function and passes in a SalesOrderID
value:
1 |
SELECT TotalItems FROM dbo.ifGetTotalItems(43659); |
For this example, we pass in 43659
the parameter value. The function then returns 26
. However, all we’ve so far is to demonstrate that the function works as we expect. Let’s create a more complex SELECT
statement that applies the function to each row returned from the SalesOrderHeader
table. That’s where the APPLY
operator comes in, as shown in the following example:
1 2 3 4 5 |
SELECT s.SalesOrderID, s.OrderDate, s.SalesPersonID, f.TotalItems FROM Sales.SalesOrderHeader s CROSS APPLY dbo.ifGetTotalItems(s.SalesOrderID) f ORDER BY SalesOrderID; |
Notice that after we specify the SalesOrderHeader
table in the FROM
clause, we then include the CROSS
APPLY
keywords, following by the function, with the SalesOrderID
column passed in as the parameter value.
The APPLY
operator takes two forms: CROSS
APPLY
and OUTER
APPLY
. The CROSS
APPLY
combination returns rows from the primary table (in this case, SalesOrderHeader
) only if they produce a result set from the table-valued function. The OUTER
APPLY
combination returns all rows from the primary table. In this case, both forms of APPLY
return the same number of rows. The following table shows a partial list of the results returned by this query:
SalesOrderID |
OrderDate |
SalesPersonID |
TotalItems |
43659 |
2005-07-01 00:00:00.000 |
279 |
26 |
43660 |
2005-07-01 00:00:00.000 |
279 |
2 |
43661 |
2005-07-01 00:00:00.000 |
282 |
38 |
43662 |
2005-07-01 00:00:00.000 |
282 |
54 |
43663 |
2005-07-01 00:00:00.000 |
276 |
1 |
43664 |
2005-07-01 00:00:00.000 |
280 |
14 |
43665 |
2005-07-01 00:00:00.000 |
283 |
20 |
43666 |
2005-07-01 00:00:00.000 |
276 |
7 |
43667 |
2005-07-01 00:00:00.000 |
277 |
6 |
43668 |
2005-07-01 00:00:00.000 |
282 |
93 |
43669 |
2005-07-01 00:00:00.000 |
283 |
1 |
43670 |
2005-07-01 00:00:00.000 |
275 |
6 |
43671 |
2005-07-01 00:00:00.000 |
283 |
17 |
43672 |
2005-07-01 00:00:00.000 |
282 |
9 |
43673 |
2005-07-01 00:00:00.000 |
275 |
20 |
As you can see, the results include the TotalItems
column, which is the number of items associated with that sale. In other words, we were able to apply the table-valued function to each row returned from the SalesOrderHeader
table. If our function had returned multiple rows, the result set would have included that number of rows for each row returned by the function. For example, if the function always returned three rows for each SalesOrderID
value, our resultset would include three times the number of rows than it currently does.
“I’ve heard you can use a function to parameterize a view, but I don’t see how you can incorporate a function into a view definition in such a way to support parameters. Can you explain how that is done?”
Using a function to parameterize a view has little to do with the view definition itself. It merely means you’re creating a function that either duplicates the logic of the view or calls the view within the function. In either case, you use a parameter to qualify the function’s SELECT
statement.
For example, suppose we create the following view to retrieve data about the number of employees per job title:
1 2 3 4 5 6 7 8 9 10 11 12 |
USE AdventureWorks2012; GO IF OBJECT_ID(N'dbo.JobData', N'V') IS NOT NULL DROP VIEW dbo.JobData; GO CREATE VIEW dbo.JobData WITH SCHEMABINDING AS SELECT JobTitle, COUNT(*) AS TotalEmps FROM HumanResources.Employee GROUP BY JobTitle; GO |
As you can see, the view is very straightforward. The SELECT
statement groups the data by the JobTitle
column and retrieves a count for each group. You can, of course, create a view that is far more complex than this one, but what we’ve done here is enough to demonstrate how this all works.
Once you’ve created the view, you can test it by running a simple SELECT
statement, similar to the following:
1 |
SELECT * FROM dbo.JobData; |
Not surprisingly, the statement returns all rows and columns returned by the view. The following table provides a partial list of those results.
JobTitle |
TotalEmps |
Accountant |
2 |
Accounts Manager |
1 |
Accounts Payable Specialist |
2 |
Accounts Receivable Specialist |
3 |
Application Specialist |
4 |
Assistant to the Chief Financial Officer |
1 |
Benefits Specialist |
1 |
Buyer |
9 |
Chief Executive Officer |
1 |
Chief Financial Officer |
1 |
Control Specialist |
2 |
Database Administrator |
2 |
Design Engineer |
3 |
Document Control Assistant |
2 |
Document Control Manager |
1 |
As with any SELECT
statement that retrieves data from a view, we can further refine our SELECT
statement by including the logic necessary to return the results we need, as shown in the following example:
1 2 |
SELECT TotalEmps FROM dbo.JobData WHERE JobTitle = 'Buyer'; |
In this case, we’ve merely specified a column in the SELECT
list and added a WHERE
clause that limits the results to those rows in which the JobTitle
value equals Buyer
. Now the statement returns only a value of 9
because that’s how many employees have that title.
Rather than creating a view and then qualifying the SELECT
statements that call the view, we can instead create a table-valued function that incorporates the view’s logic, but also provides the ability to qualify the results through a parameter, as shown in the following example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
USE AdventureWorks2012; GO IF OBJECT_ID(N'dbo.GetJobData', N'IF') IS NOT NULL DROP FUNCTION dbo.GetJobData; GO CREATE FUNCTION dbo.GetJobData (@title NVARCHAR(25)) RETURNS TABLE WITH SCHEMABINDING AS RETURN ( SELECT COUNT(*) AS TotalEmps FROM HumanResources.Employee WHERE JobTitle = @title GROUP BY JobTitle ); GO |
Notice that the function’s SELECT
statement is similar to that of the view’s except that we also include a WHERE
clause that compares the JobTitle
column to the @title
input parameter. When you call the function, you simply pass in the job title as an argument:
1 |
SELECT * FROM dbo.GetJobData('Buyer'); |
As to be expected, the SELECT
statement returns a value of 9
. Chances are, however, you’ll want to incorporate the function in a more complex query. In the following example, we use the APPLY
operator to join the Person
and Employee
tables to the GetJobData
function:
1 2 3 4 5 6 7 8 9 |
SELECT p.FirstName + ' ' + p.LastName AS FullName, e.JobTitle, f.TotalEmps FROM HumanResources.Employee e INNER JOIN Person.Person p ON e.BusinessEntityID = p.BusinessEntityID CROSS APPLY dbo.GetJobData(e.JobTitle) f; |
For each row returned from the joined Person
and Employee
tables, the GetJobData
function is applied to that row, based on the value of the JobTitle
column. The results will then include a TotalEmps
column, which will provide the total number of employees who share the same title as the person listed in that row. The following table provides a partial list of employees, their job titles, and the number of people who share that title.
FullName |
JobTitle |
TotalEmps |
Ken Sánchez |
Chief Executive Officer |
1 |
Roberto Tamburello |
Engineering Manager |
1 |
Rob Walters |
Senior Tool Designer |
2 |
Gail Erickson |
Design Engineer |
3 |
Jossef Goldberg |
Design Engineer |
3 |
Ovidiu Cracium |
Senior Tool Designer |
2 |
Thierry D’Hers |
Tool Designer |
2 |
Janice Galvin |
Tool Designer |
2 |
Michael Sullivan |
Senior Design Engineer |
1 |
Sharon Salavaria |
Design Engineer |
3 |
David Bradley |
Marketing Manager |
1 |
Kevin Brown |
Marketing Assistant |
3 |
John Wood |
Marketing Specialist |
5 |
Mary Dempsey |
Marketing Assistant |
3 |
Wanida Benshoof |
Marketing Assistant |
3 |
The point to all this is that we were able to encapsulate the logic of the view and do a lot more. One issue with this approach, however, is that we have two similar sets of code that would both need to be updated if the schema changed. If we want to avoid this scenario (and simplify our function in the process), we can call the view from within the function definition, as shown in the following example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
USE AdventureWorks2012; GO IF OBJECT_ID(N'dbo.GetJobData', N'IF') IS NOT NULL DROP FUNCTION dbo.GetJobData; GO CREATE FUNCTION dbo.GetJobData (@title NVARCHAR(25)) RETURNS TABLE WITH SCHEMABINDING AS RETURN ( SELECT TotalEmps FROM dbo.JobData WHERE JobTitle = @title ); GO |
This function achieves the results as the preceding example, but instead calls the view. That way, if the view changes in a way that does not affect the function’s SELECT
statement, the function will not need to be updated, and we’ve simplified the function’s code in the process. In that sense, we have truly parameterized the view.
“I’m creating a user-defined function and want to specify a default value for an input parameter. Is that possible?”
Yes, it is possible and fairly easy to do. When defining the parameter, include the default value, along with the equal sign, as shown the following example (highlighted):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
USE AdventureWorks2012; GO IF OBJECT_ID(N'dbo.GetPersonTypeCount', N'IF') IS NOT NULL DROP FUNCTION dbo.GetPersonTypeCount; GO CREATE FUNCTION dbo.GetPersonTypeCount (@type NCHAR(2) = 'IN') RETURNS TABLE WITH SCHEMABINDING AS RETURN ( SELECT COUNT(*) AS PersonCount FROM Person.Person WHERE PersonType = @type GROUP BY PersonType ); GO |
In this case, we’ve merely specified that the default value for the @type
parameter is IN
. However, if we want to use the default value when calling the view, we must specify the default
keyword as the parameter value, as shown in the following example:
1 |
SELECT * FROM dbo.GetPersonTypeCount(default); |
The function will now use the IN
default value and return a value of 18484
. That said, even though we’ve defined a default value, we can still specify that value when calling the function:
1 |
SELECT * FROM dbo.GetPersonTypeCount('in'); |
Or we can specify a different value, as we would if no default had been specified:
1 |
SELECT * FROM dbo.GetPersonTypeCount('em'); |
This time around, our SELECT
statement returns a value of 273
.
“I created a scalar function that’s used in multiple queries, some of which can return millions of rows. Ever since I incorporated the function into the queries, performance has nosedived. Are there steps we can take to fix this?”
Yes. Get rid of the function. Scalar functions are the bane of DBAs everywhere. The database engine has a habit of executing a scalar function for each row of data returned by the query. Even if a function can return only a dozen possible values, it might still run millions of times. You would think the query optimizer would be smarter than this. It is not.
If your query returns relatively few rows, a scalar function is usually no big deal and you can enjoy the encapsulation, parameterization, and ease of implementation that the function affords. But turn those few rows into millions and you’re suddenly faced with an abundance of extra executions, which translate into unnecessary disk I/O as well as hits on memory and processing resources. Plus, you can incur extra locking, which can further impact concurrency and performance.
To complicate matters, the estimated and actual execution plans that your queries generate are far from reliable. The true cost of the execution is often buried within the plan properties. You might even have to look to profile traces to get a more accurate picture of what the function is doing.
When possible, consider turning your scalar function into an inline table valued function. Such a function returns a table, rather than a single value (as is the case with a scalar function). However, you can create a table-valued function that returns only a single column with a single row, giving you results comparable to the scalar function. Best of all, the database engine normally executes the table-value function only once, regardless of the number of rows returned by the query, resulting in far better performance.
Assuming you can turn your scalar function into a table-value one, you would then incorporate the function into your query’s FROM
clause when retrieving the data, rather than in the SELECT
list. For example, suppose we were to start with a basic scalar function similar to the following:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
USE AdventureWorks2012; GO IF OBJECT_ID(N'dbo.fnGetTotalItems', N'FN') IS NOT NULL DROP FUNCTION dbo.fnGetTotalItems; GO CREATE FUNCTION dbo.fnGetTotalItems (@OrderID INT) RETURNS INT WITH SCHEMABINDING AS BEGIN DECLARE @TotalItems INT SELECT @TotalItems = SUM(OrderQty) FROM Sales.SalesOrderDetail WHERE SalesOrderID = @OrderID GROUP BY SalesOrderID RETURN @TotalItems; END; GO |
The function simply returns the total number of items associated with a sale in the SalesOrderDetail
table, based on the inputted SalesOrderID
value. You can then use a SELECT
statement to run the function, in this case, entering a SalesOrderID
value of 43659
:
1 |
SELECT dbo.fnGetTotalItems(43659); |
Based on the data in the sample database, the function returns a value of 26
. If you were to use the function in a slightly more complex query, it might look something like the following:
1 2 3 4 |
SELECT SalesOrderID, OrderDate, SalesPersonID, dbo.fnGetTotalItems(SalesOrderID) AS TotalItems FROM Sales.SalesOrderHeader ORDER BY SalesOrderID; |
This time around, we’re incorporating the function into the SELECT
list of a query retrieving data from the SalesOrderHeader
table. The following table shows a partial list of results returned by that query.
SalesOrderID |
OrderDate |
SalesPersonID |
TotalItems |
43659 |
2005-07-01 00:00:00.000 |
279 |
26 |
43660 |
2005-07-01 00:00:00.000 |
279 |
2 |
43661 |
2005-07-01 00:00:00.000 |
282 |
38 |
43662 |
2005-07-01 00:00:00.000 |
282 |
54 |
43663 |
2005-07-01 00:00:00.000 |
276 |
1 |
43664 |
2005-07-01 00:00:00.000 |
280 |
14 |
43665 |
2005-07-01 00:00:00.000 |
283 |
20 |
43666 |
2005-07-01 00:00:00.000 |
276 |
7 |
43667 |
2005-07-01 00:00:00.000 |
277 |
6 |
43668 |
2005-07-01 00:00:00.000 |
282 |
93 |
43669 |
2005-07-01 00:00:00.000 |
283 |
1 |
43670 |
2005-07-01 00:00:00.000 |
275 |
6 |
43671 |
2005-07-01 00:00:00.000 |
283 |
17 |
43672 |
2005-07-01 00:00:00.000 |
282 |
9 |
43673 |
2005-07-01 00:00:00.000 |
275 |
20 |
For each row returned, a column has been added that uses the fnGetTotalItems
function to calculate the total number of items sold for each order. As a result, the database engine calls the function for each row in the resultset. If we were returning millions of rows-or even billions!-all our operations could be impacted. For that reason, it’s often worth rewriting the function as a table-valued function that returns only one value:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
USE AdventureWorks2012; GO IF OBJECT_ID(N'dbo.ifGetTotalItems', N'IF') IS NOT NULL DROP FUNCTION dbo.ifGetTotalItems; GO CREATE FUNCTION dbo.ifGetTotalItems (@OrderID INT) RETURNS TABLE WITH SCHEMABINDING AS RETURN ( SELECT SUM(OrderQty) AS TotalItems FROM Sales.SalesOrderDetail WHERE SalesOrderID = @OrderID GROUP BY SalesOrderID ); GO |
The function again takes a single argument, the SalesOrderID
value, but this time returns the results as a table. That means, when we call the function, we must do so in a place in the query that accepts table expressions. For that, we can use the APPLY
operator to join the SalesOrderHeader
table to the function in the FROM
clause:
1 2 3 4 5 |
SELECT s.SalesOrderID, s.OrderDate, s.SalesPersonID, f.TotalItems FROM Sales.SalesOrderHeader s CROSS APPLY dbo.ifGetTotalItems(s.SalesOrderID) f ORDER BY SalesOrderID; |
The statement returns the same results as the previous SELECT
statement; only this time, the database engine usually calls the function only once and then applies it to each row.
You might not always be able to easily turn your scalar function into a table-value function, in which case, you’ll probably want to take another approach. You might, for example, create a stored procedure or simply create the necessary T-SQL, without encapsulating any of the logic. Whatever approach you take, you should test and compare your queries under realistic workloads and then determine which approach is best suited to your circumstances.
“I want to call the GETDATE system function from within a user-defined function, but I’ve read you cannot do this. Is there a workaround that let’s me use the GETDATE function?”
First off, despite the plethora of articles and blog posts that state you cannot use a nondeterministic function such as GETDATE
within a user-defined function, it is simply not true, at least not in SQL Server 2008 R2 and SQL Server 2012, and from what I can tell, this has been the case since SQL Server 2005. For proof, check out the TechNet article “User-Defined Functions,” or try it out yourself. You’ll find that some nondeterministic functions, such as GETDATE
, can indeed be called from within a user-defined function.
But first a step back to explain what we mean to deterministic and nondeterministic functions. A deterministic function is one that always returns the same results when given the same specific set of input values. For example, the SQRT
system function will always return the same square root value of the inputted number if that number is always the same.
However, a nondeterministic function will not necessarily return the same results each time it runs, even if input values are the same. For example, the GETDATE
, HOST_ID
and NEWID
system functions might return different results each time they’re called; therefore, they’re considered nondeterministic functions.
For quite a few years, the conventional wisdom has been that SQL Server does not let you call a nondeterministic function from within a user-defined function. However, the following CREATE
FUNCTION
statement will run with no problem:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 |
USE AdventureWorks2012; GO IF OBJECT_ID(N'dbo.GetJobData', N'IF') IS NOT NULL DROP FUNCTION dbo.GetJobData; GO CREATE FUNCTION dbo.GetJobData (@title NVARCHAR(25), @HireDate DATETIME) RETURNS TABLE WITH SCHEMABINDING AS RETURN ( SELECT COUNT(*) AS TotalEmps FROM HumanResources.Employee WHERE JobTitle = @title AND HireDate BETWEEN @HireDate AND DATEADD(MM, -6, GETDATE()) GROUP BY JobTitle ); GO |
The function determines the number of employees with the specified job title who have been hired between the specified date and current date, less six months. The function uses the GETDATE
function to determine that six-month window.
Despite what you might have heard, SQL Server will create the function. Even when I included the HOST_ID
function within the definition (just to test things out), SQL Server created the function. However, when I tried to incorporate the NEWID
, RAND
, TEXTPTR
or NEWSEQUENTIALID
nondeterministic functions into the function definition, I received an error message.
It turns out that SQL Server supports only some nondeterministic functions, such as GETDATE
and HOST_ID
. However, to be sure that GETDATE
works correctly in our function, we can test it by calling the function and passing in the necessary parameter values:
1 |
SELECT * FROM dbo.GetJobData('Buyer', '2004-01-01'); |
In this case, the function returns only seven employees with the title of Buyer who have been hired within the specified date range. We can test the function further by using the APPLY
operator to join the Employee
and Person
tables to the function, as in the following example:
1 2 3 4 5 6 7 8 9 10 |
SELECT p.FirstName + ' ' + p.LastName AS FullName, e.JobTitle, f.TotalEmps FROM HumanResources.Employee e INNER JOIN Person.Person p ON e.BusinessEntityID = p.BusinessEntityID CROSS APPLY dbo.GetJobData(e.JobTitle, e.HireDate) f ORDER BY JobTitle; |
For each row returned from the joined tables, the function is applied, based on the JobTitle
and HireDate
values. The following table shows a partial list of results returned by the query.
FullName |
JobTitle |
TotalEmps |
Barbara Moreland |
Accountant |
2 |
Mike Seamans |
Accountant |
1 |
David Liu |
Accounts Manager |
1 |
Karen Berg |
Application Specialist |
1 |
Ramesh Meyyappan |
Application Specialist |
2 |
Dan Bacon |
Application Specialist |
3 |
Janaina Bueno |
Application Specialist |
4 |
Mindy Martin |
Benefits Specialist |
1 |
Mikael Sandberg |
Buyer |
9 |
Arvind Rao |
Buyer |
8 |
Linda Meisner |
Buyer |
7 |
Fukiko Ogisu |
Buyer |
6 |
Gordon Hee |
Buyer |
5 |
Frank Pellow |
Buyer |
4 |
Eric Kurjan |
Buyer |
3 |
Erin Hagens |
Buyer |
2 |
Ben Miller |
Buyer |
1 |
Ken Sánchez |
Chief Executive Officer |
1 |
Laura Norman |
Chief Financial Officer |
1 |
Notice how, for the Buyer job title, the number of total employees decreases with each row. This is because the date range is shrinking with each row, as the hire date is compared to the GETDATE
value (minus the six months). The point is, the GETDATE
function, a built-in nondeterministic function, works fine in our user-defined function. So don’t believe everything you read.
In cases when you want to use a nondeterministic function that is not permitted, such as NEWID
, RAND
, TEXTPTR
or NEWSEQUENTIALID
, you can get around this limitation by calling the system function within a view and then calling the view from within your user-defined function. Or instead consider creating a stored procedure that achieves what your after or just going with straight T-SQL and not encapsulating the logic.
“I want to execute a stored procedure from within a user-defined function, but I keep running into errors in the syntax. How do I call a stored procedure from within a function?”
The reason you’re receiving errors is because, technically, you cannot call a stored procedure from within a user-defined function, unless it is an extended stored procedure. But even there, time is running out. Extended stored procedures have been deprecated and will eventually disappear from the SQL Server landscape.
That said, you’ll find plenty of online articles and blog posts suggesting methods for bypassing SQL Server’s built-in limitation against calling stored procedures within a function. One suggested workaround is to use an xp_cmdshell
command to run a batch file that executes the stored procedure. Another suggestion is to use the OPENQUERY
built-in function to connect via a linked server and then call the stored procedure.
In either case, you’re essentially creating a second process to trick SQL Server into running the stored procedure. The problem with such an approach is that you can end up tricking other SQL Server components as well, and end up in a deadlock that SQL Server cannot resolve.
SQL Server functions are specifically designed to prevent them from being able to change the state of the database in any way. Because stored procedures can make such changes, you cannot run them from within a function, just like you cannot modify the schema or the stored data. Functions essentially read data and then usually, in some way, manipulate the read data. But a function should never impact the underlying schema or data. That means you cannot create a function that inadvertently corrupts your entire database. If you need a routine to run a batch that includes stored procedures, or that modifies data or metadata, you need a stored procedure.
“When creating user-defined functions, I like to include the SCHEMABINDING option, but in some cases, I receive an error when creating the function, saying that I cannot schema bind the function. Any idea what’s going on?”
My guess is that you’re referencing a view or user-defined function within your function that is itself not schema bound. You cannot schema bind a function if the referenced object isn’t bound.
Schema binding ensures that the function you’re creating is bound to the database objects that it references. When you specify the SCHEM
ABINDING
option, no one can modify the base object in a way that would affect the function definition. The function must first be modified or dropped to remove any dependencies before the underlying objects can be changed.
However, when your function definition references a view or other user-defined function, that object must also be schema bound before you can apply the SCHEMABINDIN
G
function to your new function, otherwise, you’re not really protecting the new function from underlying schema changes.
Let’s look at an example to get a sense of how this works. Suppose we create a basic table-valued function similar to the one shown in the following example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
USE AdventureWorks2012; GO IF OBJECT_ID(N'dbo.GetPersonTypeCount', N'IF') IS NOT NULL DROP FUNCTION dbo.GetPersonTypeCount; GO CREATE FUNCTION dbo.GetPersonTypeCount (@type NCHAR(2)) RETURNS TABLE WITH SCHEMABINDING AS RETURN ( SELECT COUNT(*) AS PersonCount FROM Person.Person WHERE PersonType = @type GROUP BY PersonType ); GO |
Notice that the function’s SELECT
statement references only the Person
table and its columns. You can include the SCHEMABINDING
option in the WITH
clause with no problem because you’re directly referencing the database objects (the table and columns). If we were to call the function, as in the following example, it would return the number of people associated with a specific type, in this case, in
.
1 |
SELECT * FROM dbo.GetPersonTypeCount('in'); |
The SELECT
statement returns a value of 18
484
. We can achieve the same results be creating the following view and then qualifying our SELECT
statement when we call the view:
1 2 3 4 5 6 7 8 9 10 11 |
USE AdventureWorks2012; GO IF OBJECT_ID(N'dbo.PersonTypeCount', N'V') IS NOT NULL DROP VIEW dbo.PersonTypeCount; GO CREATE VIEW dbo.PersonTypeCount AS SELECT PersonType, COUNT(*) AS PersonCount FROM Person.Person GROUP BY PersonType; GO |
To return the same results as the preceding SELECT
statement, we must include a WHERE
clause to specify the person type, as shown in the following example:
1 2 |
SELECT PersonCount FROM dbo.PersonTypeCount WHERE PersonType = 'in'; |
Suppose we now want to update our function to reference the view, rather than include its own object. This allows us to pass in a parameter without having to define a WHERE
clause each time we call the view. The function does it for us, as shown in the following definition:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
USE AdventureWorks2012; GO IF OBJECT_ID(N'dbo.GetPersonTypeCount', N'IF') IS NOT NULL DROP FUNCTION dbo.GetPersonTypeCount; GO CREATE FUNCTION dbo.GetPersonTypeCount (@type NCHAR(2)) RETURNS TABLE AS RETURN ( SELECT PersonCount FROM dbo.PersonTypeCount WHERE PersonType = @type ); GO |
Notice in this function definition we do not include the WITH
SCHEMABINDING
clause, as we did the first time we created the function. If we include it, we’ll receive an error message saying that we cannot schema bind the function. We would either have to remove the clause or re-create the view to include the WITH
SCHEMABINDING
clause, as is the following example:
1 2 3 4 5 6 7 8 9 10 11 12 |
USE AdventureWorks2012; GO IF OBJECT_ID(N'dbo.PersonTypeCount', N'V') IS NOT NULL DROP VIEW dbo.PersonTypeCount; GO CREATE VIEW dbo.PersonTypeCount WITH SCHEMABINDING AS SELECT PersonType, COUNT(*) AS PersonCount FROM Person.Person GROUP BY PersonType; GO |
After we re-create the view, we can then run our function definition again, but this time with the WITH
SCHEMABINDING
clause:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
USE AdventureWorks2012; GO IF OBJECT_ID(N'dbo.GetPersonTypeCount', N'IF') IS NOT NULL DROP FUNCTION dbo.GetPersonTypeCount; GO CREATE FUNCTION dbo.GetPersonTypeCount (@type NCHAR(2)) RETURNS TABLE WITH SCHEMABINDING AS RETURN ( SELECT PersonCount FROM dbo.PersonTypeCount WHERE PersonType = @type ); GO |
Because we made the view schema bound, we can now do the same with the function. As a result, no schema changes can be made that will affect either the view or the function.
“I’ve seen user-defined functions called in different ways in code. Is there a proper way to call a function?”
The ways in which you can call a user-defined function depends on whether it is a scalar function or a table-valued function. For example, suppose we have the following scalar function:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
USE AdventureWorks2012; GO IF OBJECT_ID(N'dbo.fnGetTotalItems', N'FN') IS NOT NULL DROP FUNCTION dbo.fnGetTotalItems; GO CREATE FUNCTION dbo.fnGetTotalItems (@OrderID INT) RETURNS INT WITH SCHEMABINDING AS BEGIN DECLARE @TotalItems INT SELECT @TotalItems = SUM(OrderQty) FROM Sales.SalesOrderDetail WHERE SalesOrderID = @OrderID GROUP BY SalesOrderID RETURN @TotalItems; END; GO |
The function takes an input parameter based on the SalesOrderID
column in the SalesOrderDetail
table and returns a total number of items associated with that sale. You can call the function within a column expression in your query’s SELECT
, WHERE
, or HAVING
clause. For example, that query can be something as simple as the following:
1 |
SELECT dbo.fnGetTotalItems(43659); |
In this case, the query returns a value of 26
, which means order 43659 has 26 items associated with it. The column expression in this case is simply the function and its parameter value. However, we can create a query slightly more complex:
1 2 3 4 |
SELECT SalesOrderID, OrderDate, SalesPersonID, dbo.fnGetTotalItems(SalesOrderID) AS TotalItems FROM Sales.SalesOrderHeader ORDER BY SalesOrderID; |
Once again, the function is included as part of a column expression in the SELECT
list. Only this time, we’ve assigned an alias to the column and specified the SalesOrderID
column as the parameter’s value. The following table provides a partial list of the values returned by this statement.
SalesOrderID |
OrderDate |
SalesPersonID |
TotalItems |
43659 |
2005-07-01 00:00:00.000 |
279 |
26 |
43660 |
2005-07-01 00:00:00.000 |
279 |
2 |
43661 |
2005-07-01 00:00:00.000 |
282 |
38 |
43662 |
2005-07-01 00:00:00.000 |
282 |
54 |
43663 |
2005-07-01 00:00:00.000 |
276 |
1 |
43664 |
2005-07-01 00:00:00.000 |
280 |
14 |
43665 |
2005-07-01 00:00:00.000 |
283 |
20 |
43666 |
2005-07-01 00:00:00.000 |
276 |
7 |
43667 |
2005-07-01 00:00:00.000 |
277 |
6 |
43668 |
2005-07-01 00:00:00.000 |
282 |
93 |
43669 |
2005-07-01 00:00:00.000 |
283 |
1 |
43670 |
2005-07-01 00:00:00.000 |
275 |
6 |
43671 |
2005-07-01 00:00:00.000 |
283 |
17 |
43672 |
2005-07-01 00:00:00.000 |
282 |
9 |
43673 |
2005-07-01 00:00:00.000 |
275 |
20 |
Despite the differences between the preceding two SELECT
statements, they’re essentially doing the same thing, in terms of calling the function as part of a column expression. However, SQL Server also lets you call a scalar function within an EXECUTE
statement, as shown in the following example:
1 2 3 4 |
DECLARE @OrderQty INT; SET @OrderQty = NULL; EXEC @OrderQty = dbo.fnGetTotalItems @OrderID = 43659; SELECT @OrderQty; |
Like our first SELECT
statement, the T-SQL here returns a value of 26
; however, the way we get there is a bit more complicated in this example. First we declare the @
OrderQty
variable, set its value to NULL
, and then use the EXECUTE
statement to call the function and assign the returned value to the variable. Finally, we use a SELECT
statement to retrieve the variable value.
Not surprisingly, the ways in which we can call a table-valued function are different because the function returns a full resultset (table). That means we must treat the function similar to how we would treat a table or view in our queries. Let’s look at an example to see how this works. The following T-SQL creates a table-valued function that contains logic similar to our previous example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
USE AdventureWorks2012; GO IF OBJECT_ID(N'dbo.ifGetTotalItems', N'IF') IS NOT NULL DROP FUNCTION dbo.ifGetTotalItems; GO CREATE FUNCTION dbo.ifGetTotalItems (@OrderID INT) RETURNS TABLE WITH SCHEMABINDING AS RETURN ( SELECT SUM(OrderQty) AS TotalItems FROM Sales.SalesOrderDetail WHERE SalesOrderID = @OrderID GROUP BY SalesOrderID ); GO |
When calling the function, we must remember that we’re dealing with a table-like resultset. The simplest way to do this is to include the function in our query’s FROM
clause, where a table expression is expected:
1 |
SELECT TotalItems FROM dbo.ifGetTotalItems(43659); |
Once again, our SELECT
statement returns a value of 26
. However, if the function had returned more rows and more columns, those would be included as well. In this case, the SELECT
statement returns whatever results the function generates.
When working with table-valued functions, you can also use the APPLY
operator to join a table to the function, as shown in the following example:
1 2 3 4 5 |
SELECT s.SalesOrderID, s.OrderDate, s.SalesPersonID, f.TotalItems FROM Sales.SalesOrderHeader s CROSS APPLY dbo.ifGetTotalItems(s.SalesOrderID) f ORDER BY SalesOrderID; |
This time around, we’re calling the function in the FROM
clause (by joining it to the table) and including the function’s returned value (TotalItems
) in the SELECT
list. The SELECT
statement will then return the same results shown with the preceding scalar function example. Be aware, however; you cannot use an EXECUTE
statement to call a table-valued function. That is reserved for scalar functions only.
“I’m working on a query whose T-SQL code I want to encapsulate and parameterize. The query will aggregate a subset of values based on an input parameter and return the aggregated value as a column in the query results. Should I create a stored procedure or a user-defined function?”
Based on what you’re describing, a user-defined function is likely the way to go. SQL Server lets you call a function from within a SELECT
statement, which you cannot do with a stored procedure. However, both provide a mechanism for encapsulating your T-SQL and passing in parameters. With stored procedures, though, you can define both input and output parameters. Functions are limited to input parameters. In addition, a function must return a single result, either a scalar value or a table. Stored procedures can return a single result, multiple results, or no results.
Another difference is that you can include all sorts of T-SQL statements within a stored procedure, which means you can retrieve data, modify data, create tables, delete tables, or take a variety of other actions. In a user-defined function, you’re basically limited to retrieving data, along with creating and managing local variables and calling extended stored procedures (which have been deprecated and will one day disappear altogether). Basically, you cannot take any action in a function that would modify the database state.
Also worth nothing is that you can call a function from within a stored procedure, but not the other way around (at least not without some clunky and often risky workarounds). Stored Procedures also let you incorporate TRY...CATCH
error handling into your code. Functions do not.
It’s not all bad news for functions, however. One of their biggest advantages, as noted above, is the ability to call them from within your SELECT
statements. Not only can you use functions in the SELECT
list, but also within your WHERE
and HAVING
clauses. In addition, you can join a table-valued function in your FROM
clause to one or more tables, usually with the help of the APPLY
operator. Plus, you can use functions in your computed column and CHECK
constraint definitions. Try doing that with a stored procedure.
Load comments